I. Introduction

A. Ribosomal and Nonribosomal Peptide
Synthesis

Significant progress has been made in the past four 
decades toward understanding the structures and 
synthesis of bioactive peptides produced by microor- 
ganisms through the ribosomal and the nonribosomal 
mechanisms. 1-8 Some peptides, such as the lantibiotics
(which contain the thioether amino acid lanthionine),
belong to a group of highly stable multicyclic
peptide antibiotics that are of ribosomal origin. They 
are synthesized through proteolytic processing of 
gene-encoded precursors that have undergone several
posttranslational modification events such as dehy- 
dration and addition of neighboring sulfhydryl groups 
to form thioethers. 3,4,9 Prototype peptides of this
group are nisin, subtilin, and epidermin. They act 
primarily on Gram-positive bacteria and most serve 
as food preservatives. A vast array of other natural 
peptides with remarkable structural diversity (Figure 
1) produced by microorganisms living in different 
habitats, spread from aquatic to terrestrial environ- 
ments, are not gene encoded but are synthesized 
nonribosomally on large multifunctional enzymes 
called peptide synthetases. 1,2,5-8 The component
moieties of these special metabolites are activated in 
the form of adenylate, acylphosphorylate, or coen- 
zyme A derivatives, before they are linked together 
to form the final products. It is now accepted that 
this nonribosomal peptide synthetic route is an 
alternative means of manufacturing highly special- 
ized polypeptides. 

B. Template-Directed Peptide Synthesis 

Some microorganisms contain multienzyme com- 
plexes that build specific protein templates for a 
nucleic acid-independent biosynthesis of low molec- 
ular weight peptides of diverse structures and broad 
spectrum of biological activities. 2 In this nonriboso- 
mal mechanism of peptide synthesis, compounds such 
as lipopeptides, depsipeptides, and peptidolactones 
are assembled from an exceedingly diverse group of 
precursors (to date more than 300 are known 10 ) 
including pseudo, nonproteinogenic, hydroxy, N- 
methylated, and D-amino acids (Table 1). In contrast, 
the nucleic acid-dependent ribosomal synthesis of 
peptides and proteins is restricted to the incorpora- 
tion of only 21 proteinogenic amino acids (including 
selenocysteine 11-13). Nonribosomal protein template-
directed synthesis of peptides is only limited by the
length of the peptide chain formed, which has been 
found to range from 2 to 48 residues. 2,6 However, the
peptide backbone of these short bioactive peptides 
can be composed of linear, cyclic, or cyclic branched 
structures that can be further modified by acylation, 
glycosylation, or heterocyclic ring formation (Table 
2). These structurally diverse compounds are en- 
dowed with a broad spectrum of biological properties 
including antimicrobial, antiviral, or antitumor activities. 2,9,14
Others express immunosuppressive or enzyme-inhibiting
activity. Thus, members of this
important class of peptide secondary metabolites 
have found widespread use in medicine, agriculture, 
and biological research. On the other hand, their 
physiological role in the metabolism of the source 
organisms has been the subject of considerable 
speculation. 14 Proposed roles range from signal molecules
for coordination of growth and differentiation
in the producers (most of them are spore-forming soil
inhabitants), 15-18 evolutionary relics or breakdown
products of cellular metabolism, 19 to defense weapons 
that kill other competitor microorganisms. 9 

Although structurally diverse, most of these bio- 
logically active peptides share a common mode of 
synthesis, the multienzyme thiotemplate mechanism 
(Figure 2). 5-8,20,21 According to this model, peptide
bond formation takes place on multienzymes designated
peptide synthetases, on which amino acid
substrates are first activated by ATP hydrolysis to 
the corresponding adenylate. This unstable inter- 
mediate is subsequently transferred to another site 
of the multienzyme where it is bound as a thioester 
to the cysteamine group of an enzyme-bound 4'-
phosphopantetheinyl (4'-PP) cofactor. 7,22-24

Recently, it has been shown that peptide syn- 
thetases, like fatty acid synthases and polyketide 
synthases, require posttranslational modification to 
become catalytically active. 24-26 The inactive apoproteins
are converted to their active holoforms by
posttranslational transfer of the 4'-PP moiety of 
coenzyme A to the side chain of a highly conserved 
serine residue located in peptide synthetases at the 
C-terminal region of each substrate activating unit, 
a region recently defined as the acylation or thiolation 
domain (see section IV).

At this stage, the thiol-activated substrates can 
undergo modifications such as epimerization or N-methylation. 6,27,28
Thioesterified substrate amino acids are then integrated
into the peptide product through a step-by-step elongation by a series of
transpeptidation reactions. 6-8,2? These occur by transfer
of the thioester-activated carboxyl group of one
residue to the adjacent amino group of the next 
amino acid, thus effecting N to C stepwise assembly 
of the peptide product. During this condensation 
process all intermediates are covalently attached to 
the multienzyme complex. In conclusion, as shown 

Figure 1. Chemical structures of some bacterial (gramicidin S, surfactin, bacitracin A, and tyrocidine A) and fungal (HC-
toxin, enniatin, cyclosporin, and isopenicillin) peptide antibiotics whose peptide backbones are synthesized by the
nonribosomal thiotemplate mechanism. Genes encoding the involved peptide synthetases are shown in Figure 3.



in Figure 2, the 4'-PP cofactors facilitate the ordered 
transfer of the carboxy-activated thioester substrates 
between the active units that constitute the peptide 
synthetases, resulting in the formation of a peptide 
of defined sequence. 

Protein chemical studies and the recent progress 
in cloning and sequencing of genes encoding peptide 
synthetases of bacterial and fungal origin provide 
valuable insights into the molecular architecture of 
these enzymes. 29-42 A modular structure for these
multienzyme complexes has emerged, in which the 
substrate activating/modifying units are aligned in 
a sequence that is colinear with the amino acid 
sequence of the assembled peptide. 5-7,23 These units
have been designated as modules according to a 
definition originally applied by L. Katz and co- 
workers to the arrangement of genes encoding type 
I polyketide synthases. 43 On the basis of comparison 
of DNA sequences encoding several peptide synthetases
and recent studies on heterologous expression
of DNA fragments 28,44-49 that encode proteins
that activate individual amino acids, modules were 
defined as semiautonomous units within peptide 
synthetases that carry all information needed for 
recognition, activation, and modification of one sub- 
strate. This means that the number of modules and 
their order within a peptide synthetase define the 
sequence and the length of the synthesized peptide 
(Figure 2A, B). The modules, although proposed to 
act independently of each other, have to work in
concert during peptide elongation. This template-
based mode of action, in which different 4'-PP pros- 
thetic groups (one for each module) are involved in 
peptide and depsipeptide bond formation, has been 
designated the multiple carrier thiotemplate mechanism.
It is now a universally accepted model for
nonribosomal peptide synthesis. 6-8,23
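
The colinearity rule described above (the number and order of modules define the length and sequence of the peptide) can be illustrated with a minimal sketch. The names Module and predicted_peptide are hypothetical and introduced only for illustration; the module order for GrsA and GrsB is taken from the description of Figure 2, and the model ignores everything except substrate choice and epimerization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One amino acid-activating module (hypothetical illustration type)."""
    substrate: str            # amino acid recognized by the adenylation domain
    epimerizes: bool = False  # True if the module carries an epimerization domain


def predicted_peptide(modules):
    """Return the linear peptide implied by the module order (N to C)."""
    residues = []
    for module in modules:
        prefix = "D-" if module.epimerizes else ""
        residues.append(prefix + module.substrate)
    return "-".join(residues)


# Module order of the gramicidin S synthetases as described for Figure 2:
# GrsA carries one module plus an epimerization domain, GrsB four modules.
grs_a = [Module("Phe", epimerizes=True)]
grs_b = [Module("Pro"), Module("Val"), Module("Orn"), Module("Leu")]

pentapeptide = predicted_peptide(grs_a + grs_b)
print(pentapeptide)
```

In this toy model, reordering or exchanging entries in the module list changes the predicted peptide accordingly, which is the essence of the template concept; the head-to-tail condensation of two pentapeptides that yields gramicidin S is not modeled.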

II. Genes Encoding Modular Peptide Synthetases

A large number of bacterial operons and fungal 
genes encoding peptide synthetases have recently 
been cloned, sequenced, and partially characterized. 29-42,50
Different cloning strategies were used,
including probing of expression libraries by antibod- 
ies raised against peptide synthetases, complemen- 
tation of deficient mutants, and the use of designed 
oligonucleotides derived from amino acid sequences 
of peptide synthetase fragments. 44,51,52 Recently,
utilization of the polymerase chain reaction (PCR) 
technology to amplify specific sequences from ge- 
nomic DNA, by using degenerate oligonucleotides 
corresponding to highly conserved motifs in peptide 
synthetases (see section IV.A), established a convenient
general approach for the identification and
cloning of putative genes encoding these multienzymes. 51,52

The complete DNA sequences of several bacterial 
operons, including grs, 32,53 srfA, 33,37 tyc, 41,54 and bac, 42
for the biosynthesis of the cyclic peptide antibiotics 

Table 1. Nonproteinogenic Constituents of Peptide Antibiotics (Examples for Some Unusual Moieties)

[Structure column: chemical drawings, not reproduced]

Name                                          Abbreviation  System(s)                  Organism(s)

Modified proteinogenic amino acids
N-methyl aa (e.g., N-methylvaline)            MeVal         cyclosporin                Tolypocladium niveum
                                                            enniatin                   Fusarium scirpi
D-aa (e.g., D-phenylalanine)                  D-Phe         e.g., bacitracin           Bacillus licheniformis
                                                            gramicidin S               Bacillus brevis
                                                            tyrocidine                 Bacillus brevis

Nonproteinogenic amino acids
δ-(L-α-aminoadipic acid)                      Aad           ACV-tripeptide             Penicillium chrysogenum
                                                            (precursor of penicillin   Aspergillus nidulans
                                                            and cephalosporin)         Streptomyces clavuligerus
2-amino-9,10-epoxy-8-oxodecanoic acid                       HC-toxin                   Cochliobolus carbonum
L-α-aminobutyric acid                                       cyclosporin                Tolypocladium niveum
(4R)-4-[(E)-2-butenyl]-4-methyl-L-threonine                 cyclosporin                Tolypocladium niveum
2,6-diamino-7-hydroxyazelaic acid                           edeine                     Bacillus brevis
ornithine                                     Orn           e.g., bacitracin           Bacillus licheniformis
                                                            gramicidin S               Bacillus brevis
                                                            tyrocidine                 Bacillus brevis

Carboxy acids
2,3-dihydroxybenzoic acid                     Dhb           enterobactin               Escherichia coli
D-α-hydroxyisovaleric acid                    Hiv           enniatin B                 Fusarium scirpi

Amines
spermidine                                    Sperm         edeine                     Bacillus brevis



gramicidin S, surfactin, tyrocidine, and bacitracin,
respectively, have been determined. These operons
span regions of 18-45 kb (Figure 3) and encode
several peptide synthetases comprising one to six
modules each. In the bacterial systems the
encoded multienzymes range in size from 126 kDa
for one-module enzymes (GrsA, TycA) to over 700 kDa
for the six modules of tyrocidine synthetase 3 (TycC).
A minimal module for substrate adenylation and
thiolation contains two distinct domains: 24,32,49 the
adenylation domain (Figure 3, red region, about 550
residues) and the thiolation domain (green region,
about 100 residues). Several gene fragments encoding
adenylation domains of different modules have been 
amplified by PCR and expressed in heterologous 
systems. The overproduced proteins were shown to
be active in substrate recognition and adenylation,
but devoid of thiolation activity. 28,41,49,56 The thiolation
domain of TycA, designated PCP for peptidyl
carrier protein (Figure 3c, green region, see section
IV.B), was also independently expressed and shown
to be active in the acylation reaction after posttranslational
modification with the cofactor 4'-PP. 24,56

Table 2. Structural Details of Peptide Antibiotics (Examples for Some Modifications at the Peptide Main Chain)

[Structure column: chemical drawings, not reproduced]

Name                                   System(s)                  Organism(s)

Ring formation
cyclization                            e.g., cyclosporin          Tolypocladium niveum
                                       gramicidin S               Bacillus brevis
                                       tyrocidine                 Bacillus brevis
branching (amide bond)                 bacitracin                 Bacillus licheniformis
branching (ester bond)                 surfactin                  Bacillus subtilis
thiazoline ring formation              bacitracin                 Bacillus licheniformis
(between Ile and Cys)
β-lactam ring formation                ACV-tripeptide             Penicillium chrysogenum
(between Cys and D-Val)                (precursor of penicillin   Aspergillus nidulans
                                       and cephalosporin)         Streptomyces clavuligerus

Additions (fatty acids and carbohydrates)
acylation                              surfactin                  Bacillus subtilis
glycosylation                          vancomycin                 Streptomyces orientalis

In the fungal systems, exemplified by the acvA, 29-31,57,58
hts1, 36 esyn1, 34 and cssA 35 genes, which encode the
templates that direct the synthesis of the tripeptide
δ-(L-α-aminoadipyl)-L-cysteinyl-D-valine (LLD-ACV, the
precursor molecule of isopenicillin N), HC-toxin of the
maize pathogen Cochliobolus carbonum, the depsipeptide
enniatin of Fusarium scirpi, and the immunosuppressive
cyclic peptide cyclosporin A of
Tolypocladium niveum (Figures 1 and 3e-h), respectively,
the peptide synthetases involved are, without
exception, integrated multienzymes. All modules are
aligned on a single polypeptide chain (Figure 3).
They range in size from 350 kDa for a two-module
enzyme like Esyn1 to a molecular mass over 1600
kDa for the 11-module cyclosporin A

[Figure 2 scheme: enzyme-bound D-Phe-S-Pant thioester and D-Phe-L-Pro-S-Pant thioester intermediates formed by thiolation and condensation on GrsA and GrsB; product gramicidin S, cyclo(D-Phe-Pro-Val-Orn-Leu)2]

Figure 2. A simplified scheme displaying the principles
of the thiotemplate-directed nonribosomal peptide synthesis.
(A) For example, the synthesis of the cyclic decapeptide
gramicidin S on the multifunctional enzymes GrsA (one
amino acid-activating module plus epimerization domain)
and GrsB (four amino acid-activating modules) is shown.
Each module (symbolized by a circle) activates the cognate
amino acid by ATP hydrolysis as aminoacyl adenylate.
This relatively unstable intermediate is stabilized by
thioesterification on the cofactor 4'-PP. Thioesterified substrates
are then integrated into the growing peptide through a
step-by-step condensation. The amino acid-activating modules
are arranged in the order that corresponds to the
amino acid sequence of the peptide. The arrows indicate
the direction of polymerization. (B) Structure of the peptide
antibiotic gramicidin S. The cyclic decapeptide is obtained
by a head-to-tail condensation of two identical
pentapeptides synthesized by the protein template
described above.

synthetase. The cssA gene is over 45 kb in length,
represents the largest gene known, and encodes a
single polypeptide chain of over 15000 amino acid
residues. 35 In general, the modular arrangement and
the domain structure of fungal peptide synthetases
are very similar to those of bacterial enzymes (Figure 3).

As mentioned above, a minimal module contains 
an adenylation and a thiolation domain and com- 
prises about 650 amino acid residues. This minimal 
size is increased when additional functional domains, 
e.g., for epimerization (Figure 3, blue regions) or 
N-methylation (yellow regions) are integrated. While
the epimerization domains (about 430 amino acid
residues) in bacterial and fungal peptide synthetases
were found to be contiguously integrated downstream
of the thiolation domains, 29,31,33,37,41,42,53 the N-methylation
domains (about 420 amino acid residues),
which were only found in the fungal enniatin and
cyclosporin A synthetases, are located between the
adenylation and thiolation domains (Figure 3g,h). 34,35
The significance of this domain arrangement within 
the different modules and its influence on the syn- 
thesis of peptides are matters of speculation. Ad- 
ditional biochemical studies on dissected domains 
and the analysis of their activities and interactions 
in vitro may shed light on this multidomain arrangement.
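
The domain sizes quoted above lend themselves to a quick back-of-the-envelope check. The following sketch adds up the approximate domain lengths given in the text; the function names are hypothetical, and the average residue mass of about 110 Da is an assumption (a common rule of thumb) that is not taken from this article.

```python
# Approximate domain lengths in amino acid residues, as quoted in the text.
DOMAIN_RESIDUES = {
    "A": 550,  # adenylation
    "T": 100,  # thiolation
    "E": 430,  # epimerization
    "M": 420,  # N-methylation
}

# Assumption: average mass of one residue in a protein, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110


def module_residues(domains):
    """Sum the approximate lengths of the domains in one module."""
    return sum(DOMAIN_RESIDUES[d] for d in domains)


def estimated_mass_kda(domains):
    """Rough molecular mass of a module built from the given domains."""
    return module_residues(domains) * AVERAGE_RESIDUE_MASS_DA / 1000


minimal_module = module_residues("AT")      # adenylation + thiolation
starter_with_e = module_residues("ATE")     # plus an epimerization domain
print(minimal_module, starter_with_e, estimated_mass_kda("ATE"))
```

Under these assumptions a minimal module comes to about 650 residues, matching the figure given in the text, and a one-module enzyme with an epimerization domain comes to roughly 119 kDa, in the same range as the 126 kDa quoted for GrsA and TycA. Linker regions and condensation domains are not included in this estimate.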

Between the different modules constituting the 
bacterial and fungal templates for the generation of 
a defined peptide, one would expect specific intra- 
and/or intermolecular interactions. These interac- 
tions are not only needed for acyl and peptidyl 
transfer reactions but also for the correct channeling 
of the peptide product. Putative regions within 
modular peptide synthetases, designated condensa- 
tion domains (Figure 3, white regions; see section 
IV.C) are believed to be the sites of such specific 
communication. 5-7,59 These domains are located
upstream of most internal adenylation domains (red 
regions) and seem to be associated with the peptide 
elongation reaction since their occurrence corre- 
sponds to the number of peptide bonds in the derived 
peptide product. Moreover, these domains are absent 
from peptide synthetase modules that are involved 
in initiation reactions, such as the gramicidin S 
synthetase 1 (GrsA) 53 and the tyrocidine synthetase 
1 (TycA). 54 In both of the latter cases, no putative 
condensation regions are present upstream of the 
adenylation domains (Figure 3a, c). Nothing is 
known about the exact mechanism of peptide elonga- 
tion in the nonribosomal system, nor is it known how 
modules interact and how this interaction may affect 
the direction of polymerization. 59,60 However, all
peptide intermediates remain covalently attached to 
the protein template during the elongation reaction. 2,5-7,21,60
Termination of nonribosomal peptide
synthesis is either initiated by the action of thio- 
esterases (see section IV.D), the transfer of the 
peptide chain to another functional group, or by 
cyclization. 2,6,21,61

III. Genes Associated with Nonribosomal Peptide
Synthesis 

Other associated genes, whose products affect 
nonribosomal peptide synthesis, have been identified 
within bacterial operons encoding peptide synthetas- 
es. Those located inside the operons were found to 
encode proteins that show significant homology to 
fatty acid thioesterases of type II. The genes encod- 
ing these thioesterase-like proteins are either located 
at the 5'-end or the 3'-end of the biosynthetic operon 
(Figure 3a-c, light pink regions). 33,41,53 The encoded
proteins (25-29 kDa) show over 30% identity and are
believed to be important but not essential for the
synthesis of the corresponding peptide (see section
IV.D). 62 

Another class of genes (gsp, sfp, and bli, Figure 3a, 
b, and d, gray regions), associated with but not 
integrated in the bacterial operons of grs, 63 srfA, 64,65

[Figure 4 diagram labels: minimal module (substrate recognition, adenylation, and thiolation); adenylation; thiolation; condensation; thioesterase; epimerization; N-methylation]

Figure 4. Schematic diagram showing the building up of a peptide synthetase module from functional domains. The particular
composition of a module depends on the given requirements with regard to substrate activation, elongation, and modification.
(See Figure 3.) With the exception of the adenylation (A domain, red) and the thiolation (T domain, green) domains, which
were both biochemically characterized, all other domains, such as condensation (C domain, gray), N-methylation (M domain,
yellow), epimerization (E domain, blue), and the thioesterase (TE domain, pink), were predicted from sequence alignments.
Highly conserved signature sequences, which are also conserved in their relative locations, are shown. These core sequences
are indicated in Table 3.



and bac 42,66 were found to be essential for nonribosomal
peptide synthesis. 25 For example, disruption
of the sfp gene, which is located about 4 kb downstream
of the 3'-end of the srfA operon, caused complete
inhibition of the production of the lipopeptide antibiotic
surfactin, although the expression of the peptide
synthetases (SrfA-A/-B/-C) was not affected. 64,65
Surprisingly, the gsp gene of the grs operon, which
encodes a 28 kDa protein that shows about 34%
identity to Sfp, complements in trans the sfp-null
mutation, indicating that Sfp and Gsp have similar
functions in nonribosomal peptide synthesis. 63 Recently,
it has also been shown that bli, located
downstream of the bacitracin biosynthetic operon
(Figure 3d) and homologous to Sfp and Gsp, restores
surfactin production in the sfp-null mutant to a
normal level. 42,66 These genetic studies clearly indicate
that Gsp, Sfp, Bli, and the EntD protein of
Escherichia coli (needed for synthesis of the sidero- 
phore enterobactin) are members of a new protein 
family that is associated with the synthesis of sec- 
ondary metabolites. Recently, Lambalot and co- 
workers have shown that these proteins display 4'- 
PP transferase activity and are responsible for the 
posttranslational modification of the corresponding 
peptide synthetases. 25 These studies led to the
discovery of a superfamily of such 4'-PP transferases 
(Table 5, see section IV.B) that are involved in the 
specific modification of 4'-PP-requiring enzymes, 
including fatty acid and polyketide synthases as well 
as several peptide synthetases from different species. 

IV. The Functional Domains of Modular Peptide 
Synthetases 

For a better understanding of the structure- 
function relationship of the building blocks of peptide 
synthetases, the modules, and how their functional
domains are arranged, biochemical dissection studies
and sequence alignments were undertaken. The 
structural features of putative domains and poten- 
tially important residues that might be involved in 
substrate specific adenylation (A domain), thiolation 
(T domain), epimerization (E domain), N-methylation
(M domain), elongation/condensation (C domain), and 
release of the thioester bound peptide chain (TE 
domain) are detailed below (Figure 4 and Table 3). 

A. Adenylation Domain 

The adenylation domains (A domain) represent the 
central points of action in multifunctional peptide 
synthetases. For each incorporated amino acid in the 
peptide product a specific adenylation domain exists, 
whose location also dictates the primary structure of 
the peptide product (see above and Figures 1 and 3). 
Hence, investigations on peptide synthetases have 
notably focused on the A domain in recent years. 

1. Activation Reaction

In order to incorporate an amino acid residue into 
a peptide through the protein template, a two-step
mechanism (Figure 2) for substrate activation is
required. 2,5-7,20,21 First, the cognate amino acid is
activated as aminoacyl adenylate at the expense of
Mg2+-ATP (Figure 5). Second, the enzyme-attached
thiol moiety 4'-phosphopantetheine (4'-PP) attacks 
the aminoacyl adenylate to yield the aminoacyl 
thioester and AMP as leaving group. The second step 
of the reaction requires the presence of the thiolation 
domain (T domain), which will be discussed below. 

The way in which the amino acid residues are 
activated resembles that catalyzed by aminoacyl- 
tRNA synthetases in the ribosomal system of peptide 
synthesis. 5,6,21,67,69 There, the cognate amino acid is
also activated as aminoacyl adenylate and then 
becomes esterified onto the 2'- or 3'-OH of the 

Table 3. Highly Conserved Core Motifs of the
Catalytic Domains of Peptide Synthetases

domain (a)       core(s) (b)     consensus sequence

adenylation      A1              L(TS)YxEL
                 A2 (core 1)     LKAGxAYL(VL)P(LI)D
                 A3 (core 2)     LAYxxYTSG(ST)TGxPKG
                 A4              FDxS
                 A5              NxYGPTE
                 A6 (core 3)     GELxIxGxG(VL)ARGYL
                 A7 (core 4)     Y(RK)TGDL
                 A8 (core 5)     GRxDxQVKIRGxRIELGEIE
                 A9              LPxYM(IV)P
                 A10             NGK(VL)DR
thiolation       T (core 6)      DxFFxxLGG(HD)S(LI)
condensation     C1              SxAQxR(LM)(WY)xL
                 C2              RHExLRTxF
                 C3 (His)        MHHxISDG(WV)S
                 C4              YxD(FY)AVW
                 C5              (IV)GxFVNT(QL)(CA)xR
                 C6              (HN)QD(YV)PFE
                 C7              RDxSRNPL
thioesterase     TE              G(HY)SxG
epimerization    E1              PIQxWF
                 E2 (His)        HHxISDG(WV)S
                 E3 (race A)     DxLLxAxG
                 E4 (race B)     EGHGRE
                 E5 (race C)     RTVGWFTxxYP(YV)PFE
                 E6              PxxGxGYG
                 E7 (race D)     FNYLG(QR)
N-methylation    M1 (SAM)        VL(DE)GxGxG
                 M2              NELSxYRYxAV
                 M3              VExSxARQxGxLD

(a) See Figure 4. (b) Former nomenclature is given in brackets.

3'-nucleotide of the corresponding tRNA, which acts 
as the carrier of the activated amino acid. Despite 
these similarities shared by the ribosomal and non- 
ribosomal system during amino acid activation, the 
enzymes involved have no similarity in primary and 
3D structures. 68,70-73 Strikingly, there exist two
classes of aminoacyl-tRNA synthetases, catalyzing
the formation of aminoacyl adenylate, with fundamentally
different folding topologies. 69,74 The recently
solved crystal structure of an adenylation
domain of a peptide synthetase reveals that Nature
has invented a third fold for the same reaction. 72,73,75

2. Adenylation Domains of Peptide Synthetases Are 
Members of a Superfamily of Adenylate-Forming
Enzymes 

The conserved region of the A domain was identi- 
fied by comparing several genes encoding peptide 
synthetases. 32 The highly conserved A domains were 
found as repetitive blocks, the number of which 
coincides with the number of amino acids activated 
by the corresponding synthetase. These blocks, con- 
nected by regions now designated the condensation 
domains (see section IV.C), 59 represent what we call 
the minimal module, containing the A and the T 
domain. The A domain is about 550 amino acids in
length. 49 It shares significant homology with the 
family of acyl-CoA synthases and luciferases, which 
are about the same size. Since all these enzymes 
catalyze an analogous reaction, the adenylation of 
their carboxy substrates, they constitute a superfamily
of adenylate-forming enzymes. 32 A T domain
connected to an A domain is exclusively found in 
peptide synthetases and is involved in the second 
part of the amino acid activation, the thiolation 
reaction (see section IV.B). 23 Taken together, these 
two domains exhibit a specific set of conserved motifs, 
the fingerprint of peptide synthetases, which has 
enabled the detection of previously unidentified genes 
encoding peptide synthetases by PCR. 51,52 Deletion
studies on gramicidin S synthetase 1 (GrsA) defined 
the boundaries of the adenylation and thiolation 
activities associated with the A and T domains 
(Figure 4). 49 
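
The idea of using the conserved motifs as a fingerprint can be sketched computationally. The example below translates consensus notation of the kind used in Table 3 (x for any residue, parentheses for alternative residues at one position) into regular expressions and scans a protein sequence for matches. The function names are hypothetical, only three motifs are included, and the test sequence is made up for illustration and is not a real protein; this is a sketch of the principle, not the PCR-based procedure used by the authors.

```python
import re


def consensus_to_regex(consensus):
    """Translate consensus notation into a regular expression.

    'x' stands for any residue; '(RK)' stands for one of the listed residues.
    """
    pattern = []
    i = 0
    while i < len(consensus):
        ch = consensus[i]
        if ch == "(":
            close = consensus.index(")", i)
            pattern.append("[" + consensus[i + 1:close] + "]")
            i = close + 1
        elif ch == "x":
            pattern.append(".")
            i += 1
        else:
            pattern.append(ch)
            i += 1
    return "".join(pattern)


# A small subset of the adenylation domain core motifs.
MOTIFS = {"A4": "FDxS", "A7": "Y(RK)TGDL", "A10": "NGK(VL)DR"}


def find_motifs(sequence):
    """Return {motif name: 0-based start of first match} for motifs found."""
    hits = {}
    for name, consensus in MOTIFS.items():
        match = re.search(consensus_to_regex(consensus), sequence)
        if match:
            hits[name] = match.start()
    return hits


# Made-up sequence (not a real protein) containing motifs A4 and A7.
toy_sequence = "MAAFDLSGGGYRTGDLAAA"
print(find_motifs(toy_sequence))
```

Because the motifs are also conserved in their relative order, a practical screen would additionally check that hits occur in the expected sequence and at plausible spacing; that step is omitted here.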

3. The A Domains Are Enzymatically and Structurally 
Independent 

While the A domain of peptide synthetases is an 
integrated part of a multifunctional enzyme, the 
homologous acyl-CoA synthases are distinct proteins. 32,72
Following the idea of the modular architecture
of peptide synthetases, the question arose
whether such a single A domain could function 
independently from the adjacent domains. The first 
insights were obtained through biochemical charac- 
terization of proteolytic and gene encoded fragments, 
expressed in E. coli, of the multimodular gramicidin 
S and tyrocidine synthetases, which exhibited activation
of one specific amino acid residue only. 28,44-48,76-78
The catalytic independence of the integrated A 
domain itself was demonstrated for the first time by 
deletion studies on the starter synthetases GrsA and 
TycA (Figure 3). 49,55 The A domains, located in the
N-terminal region, were proven to catalyze amino 
acid activation (Figure 5) with the same specificity 
as the wild-type enzymes. Very recently, it has also 
been shown that internal A domains of the multi- 
modular enzymes TycB and TycC can be expressed 
in E. coli as soluble, functional proteins. 41 These 
findings reinforce the idea of multimodular peptide 
synthetases being an assembly of structurally and 
functionally independent domains on a polypeptide 
chain that act in concert with respect to their order 
on this giant template. By swapping single domains 
within the template, novel peptide products may be 
produced (see section VI). 79 The technique of inves- 
tigating distinctly expressed internal domains opens 
up a way to decode the primary structure of a peptide 
product, for which the genes for the unknown template
have been determined. 80,81

Figure 5. Amino acid adenylation in peptide synthesis.

Figure 6. Ribbon diagram of the adenylation domain
PheA showing the large N-terminal domain and the small
C-terminal domain. The substrates, AMP (red) and phenylalanine
(orange), are drawn using a space-filling representation.
The locations of the highly conserved core motifs
(A1-A10) in the superfamily of adenylate-forming enzymes
within the PheA structure are indicated.

4. The Crystal Structure of the Phe-Activating Adenylation 
Domain of Gramicidin S Synthetase 1 

Very recently, the first peptide synthetase frag- 
ment, the A domain of GrsA obtained through C- 
terminal deletion and expression in E. coli (designated
PheA), has been crystallized and the 3D
structure solved at 1.9 Å (Figure 6). 49,75 The
overall topology is highly similar to the structure of
firefly luciferase, 72,73 although the two proteins are
only 16% identical in their primary sequence. Thus,
it can be assumed that the other members of the 
homologous superfamily of adenylate-forming en- 
zymes, i.e., the A domains of peptide synthetases, 
have a very similar structure. 

The A domain of GrsA is folded into two compact 
subdomains (to describe the crystal structure, the 
term domain means a stable tertiary fold), a large 
N-terminal and a smaller C-terminal portion, that 
are connected by a short hinge. Strikingly, the
small C-terminal domain is rotated by about 94° relative to the
N-terminal domain when compared with the
structure of firefly luciferase. It remains unknown
whether this rotation represents different stages of 
the catalytic mechanism, as the crystals of PheA 
contain the bound substrates L-phenylalanine, AMP, 
and Mg2+, in contrast to firefly luciferase, whose
structure was determined without substrates. 73 The 
smaller C-terminal domain is indispensable for the 
activity of the protein, since its deletion results in a 
complete loss of activity. 82 

The core motifs of the A domain are the best 
conserved short amino acid sequences throughout the 
superfamily of adenylate-forming enzymes (see Figure 4
and Table 3). 32 They have been the subject of
extensive investigations in recent years. Due to their 
ubiquitous presence they were thought to be involved 
in the common reactions, i.e., ATP binding, hydrolysis 
and adenylation of the carboxylate moiety of the 
substrate (Figure 5). Their location in the structure
of PheA and their interactions with the substrates 
will now be discussed and compared with the avail- 
able biochemical data (Figure 6). 

Almost all core motifs are positioned around the 
active site where the substrates are bound (Figure 
6, A1-A10). Instead of the phenylalanine adenylate,
the free amino acid and AMP are found in the 
structure: the adenylate has been hydrolyzed, and 
the pyrophosphate is missing. Most of the residues 
involved in substrate recognition are contributed by 
the larger N-terminal domain. However, a strictly 
conserved lysine residue (Lys517, Figure 6 and Table
3, A10) of the C-terminal domain is involved in key 
interactions (see below). 

A signature sequence of the superfamily of adeny- 
late-forming enzymes, 83 TSGTTGKPKG (motif A3, 
see Table 3), is mostly disordered in the structure. 
However, its orientation and distance to the AMP 
suggest an interaction with the pyrophosphate leav- 
ing group. The three G residues of the motif A3 of 
tyrocidine synthetase 1 (TycA) were mutated to A and 
the P to V without significant effect on adenylation 
activity in any mutant. 84 However, introduction of a
negative charge by replacing the first G with D in the
second adenylation domain of gramicidin S synthetase 2
(GrsB) led to complete inactivation of the enzyme. 85
Mutagenesis of the second K resulted in a drastic
reduction of activity (K to Q, 61%; K to R, 90%; and
K to T, 99.5%), 84 whereas a K to Q mutant
of the first lysine had no significant effect on the 
activities of the valine-activating domain of surfactin 
synthetase B (SrfA-B) and TycA, respectively. 85 The 
side chain of the second lysine is poorly ordered in 
the structure of PheA and projects into the solvent, 
while the first threonine of the motif A3 interacts 
with the α-phosphate.

The highly conserved core motif A7, Y(RK)TGDL 
(see Table 3 and Figure 6), which is observed in 
various ATPases, has also been investigated by 
means of mutagenesis. 84,87 A mutation of D to N
reduced activity to 78%, while a D to S substitution
retained only 12% of wild-type activity. 84 In the
structure of PheA the side chain of the aspartate
residue, which is strictly invariant in all known 
members of the superfamily of adenylate-forming 
enzymes, interacts via hydrogen bonds with the 
oxygen atoms of the nucleotide ribose moiety. 

Further contacts with the nucleotide base are 
formed by the main-chain carbonyl atom of A322 
(numbering with respect to GrsA) and the side-chain 
carbonyl oxygen of N321 to the amino group of 
adenine, the latter N321 being well conserved and 
part of the core motif A5, NxYGPTE (Figure 6 and 
Table 3). A mutant strain of Bacillus brevis, deficient 
in gramicidin S production, was found to carry
a G to D substitution in this motif of the valine-
activating domain GrsB-Val. 85 The main-chain carbonyl
atom of this glycine points toward the α-amino
group of the substrate phenylalanine. Other residues
interacting with the amino group are I330, via the
main chain carbonyl oxygen, and the strictly con- 
served D235, a part of core motif A4, FDxS (Figure 
6). Consequently, D235 is only conserved in peptide 
synthetases that activate amino acids, in contrast to 
luciferases and acyl-CoA synthases, the substrates
of which do not have an α-amino group. 16,32

The lysine residue of core motif A10, NGK (Figure 
6), binds to the α-carboxylate group of the substrate
phenylalanine as well as the ribose oxygens O-4' and
O-5'. The strictly invariant lysine is thus involved
in two key polar interactions with both the adenosine 
and the amino acid, presumably fixing their position 
in the active site and clamping the C-terminal 
domain in a certain orientation. The key role of this 
lysine is also confirmed by a K to Q mutation in the 
valine-activating domain of surfactin synthetase, 
which caused a reduction in activity of >90%, 86,88 and
by its specific labeling with fluorescein 5'-isothiocy- 
anate. 88 In contrast to the other interactions dis- 
cussed above, the NGK motif is located in the smaller 
C-terminal domain of PheA. Another well-conserved 
sequence in this folding domain, GRxxxQVKIRGx- 
RIELGEIE (motif A8, see Figure 6), was shown to 
be essential for adenylation. Mutation of the second 
G to various residues led to a loss of activity in the 
proline-activating domain of GrsB. 78 Additionally, 
labeling studies with fluorescein 5'-isothiocyanate 
and the photolabel 2-azido-ATP suggested the par- 
ticipation of at least a part of this motif in the 
adenylation reaction. 88-90 The first arginine of this
motif is another possible candidate to interact with 
the pyrophosphate. 

The core motifs A2, especially conserved in peptide
synthetases, and A1 are probably conserved only for
structural reasons, as they are far away from the
active site. Motif A1 is part of a large helix, which
significantly contributes to the fold of the N-terminal 
domain. The special functions of the core motifs A6 
and A9, both of which were labeled with 2-azido-ATP 
and therefore thought to be involved in adenylation, 89 
remain unclear. Nevertheless, they are also in 
proximity to the active site. 

For further details about the location of conserved 
residues the reader is referred to the paper dealing 
with the structure of PheA. 75 It can be concluded 
that the significance of most of the core motifs has 
been confirmed by the structure. A rotation of the 
small C-terminal domain might be the key to under- 
standing the course of the reaction, in particular with 
respect to the second part of the activation reaction, 
the formation of the thioester linkage. From the 
structure it is not clear in which direction the 
polypeptide chain will continue, i.e., the relative 
location of the following thiolation domain (see T 
domain). It is conceivable that highly conserved 
motifs that do not directly bind any of the substrates 
in PheA are important for interactions with the 
incoming 4'-phosphopantetheine arm. 
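
The motif shorthand used in this section (x for any residue, parentheses for alternative residues at one position) maps directly onto regular expressions. The following Python sketch is only an illustration and is not part of the original analysis: the function names and the toy sequence are hypothetical, and only the motif strings are taken from the text above (A3, A4, A5, A7, and A10).

```python
import re

# Core motif strings as quoted in the text; "x" means any residue and
# "(RK)" means R or K at that position.
CORE_MOTIFS = {
    "A3": "TSGTTGKPKG",
    "A4": "FDxS",
    "A5": "NxYGPTE",
    "A7": "Y(RK)TGDL",
    "A10": "NGK",
}


def motif_to_regex(motif):
    """Translate the review-style motif shorthand into a compiled regex."""
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "x":
            parts.append(".")  # any single residue
            i += 1
        elif ch == "(":
            close = motif.index(")", i)
            parts.append("[" + motif[i + 1:close] + "]")  # alternatives
            i = close + 1
        else:
            parts.append(ch)  # literal residue
            i += 1
    return re.compile("".join(parts))


def find_motifs(sequence, motifs=CORE_MOTIFS):
    """Return 1-based start positions of every motif hit in the sequence."""
    hits = {}
    for name, motif in motifs.items():
        pattern = motif_to_regex(motif)
        hits[name] = [m.start() + 1 for m in pattern.finditer(sequence)]
    return hits


# Hypothetical toy sequence built only to exercise the scanner; it is not
# a real A-domain sequence.
toy = "MATSGTTGKPKGAAFDASLLNTYGPTEAAYKTGDLAANGK"
hits = find_motifs(toy)
```

A strict pattern match of this kind only finds exact instances of the consensus; real A domains deviate from the consensus at individual positions, so a profile-based or alignment-based search would be the more robust choice in practice.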

5. Specificity of Peptide Synthetases and the Amino Acid 
Binding Pocket of PheA 

Peptide synthetases are known to be of moderate
substrate specificity compared to the aminoacyl-
tRNA synthetases. 91-93 In contrast to the high fidelity
required of the latter, 68,69,94 no significant
evolutionary pressure for accurate substrate recognition
by peptide synthetases is to be expected. A dependence
of amino acid incorporation on amino acids
added to the growth media can be observed in many 
cases, e.g., cyclosporin A, enniatin, surfactin, and 
tyrocidine synthesis. 91-93,95-100 Some adenylation
domains exhibit a higher specificity than others, 41 a 
finding that may reflect positions of special impor- 
tance in the peptide products with regard to their 
mode of action, or simply the relative difficulties in 
discriminating against other amino acids. In in vitro 
studies, when the respective amino acid concentra- 
tions can arbitrarily be chosen, peptide synthetases 
can be forced to synthesize an even wider range of 
products, as demonstrated in the case of cyclosporin
A, enniatin, and other peptide antibiotics.

The residues forming the binding pocket for the 
amino acid substrate, and thus determining the 
specificity of an adenylation domain, have been of 
particular interest ever since the determination of
the primary structures of peptide synthetases. Because of
its relative heterogeneity, the region between core
motifs A3 and A6 (see Figure 4) was thought to
accommodate the binding pocket. 33 When sequences
for this region were aligned, slight tendencies toward
a clustering of A domains with the same specificity
were observed. 32,33 Nevertheless, a stronger effect
was clustering according to the origin of the respective
domains, i.e., the organism from which they were
taken. 32

The residues interacting with the α-amino and the
α-carboxylate group of phenylalanine in the structure
of PheA have already been described above. As 
determined from the crystal structure, all residues 
involved in building the hydrophobic pocket are 
located between the core motifs A3 and A6 (data not 
shown). The pocket is closed at the bottom by the 
indole ring of W239, on one side by A236, I330, and
C331, and on the opposite side by A322, A301, and 
T278. At one side of the pocket there is a water-filled 
channel that connects with the solvent. Considering 
the remarkably high homology in the 3D structure 
between firefly luciferase and PheA, 72 73 75 it can be 
expected that in other A domains of peptide syn- 
thetases (which show between 30-80% identity) 32 the 
binding pocket is formed from the equivalent resi- 
dues. Examination of these residues in multiple 
alignments of A domains (data not shown) reveals a 
correlation between the polarity of the substrate
and that of the residues forming the pocket. The field
is now open for attempts to alter substrate specificity 
and thereby the structure of the peptide antibiotic 
by means of site-directed mutagenesis (see section 
VI). 

B. Thiolation Domain and Its Posttranslational 
Modification 

The thiolation domain (T domain) of peptide syn- 
thetases, also called peptidyl carrier protein (PCP), 
is the site of 4'-PP cofactor binding and substrate 
acylation. 22-24,27,49,101 In analogy to the acyl carrier
proteins (ACP) of modular fatty acid and polyketide
synthases, the T domain of peptide synthetases is an
integral part of these multienzymes. 1,6,7,24,102-105 This
functional unit of about 100 amino acid residues, to 
which aminoacyl substrates are bound as carboxy 
thioesters, is located in peptide synthetases directly 
downstream of the adenylation domains (A domains). 
An exception to this arrangement was found in the 
fungal modules activating N-methylated amino acids






2662 Chemical Reviews, 1997. Vol. 97, No. 7 



Marahiel et al.. 



Table 4. Enzyme Superfamily of Acyl/Peptidyl Carrier Proteins (ACPs/PCPs):
Sequence Alignment around the Highly Conserved Cofactor 4'-PP Binding Site a

Enzyme     Organism                     Position (aa)   Sequence

A) Peptide synthetases
TycA       Bacillus brevis              553             D N F Y S ? G G H S I ? A T Q V
GrsB       Bacillus brevis              2033            D N F F E L G G H S L R A M ? M
SrfA-B     Bacillus subtilis            990             D N F F M I G G H S L K A M M M
AcvA       Penicillium chrysogenum      3049            D D L F K L G G D S I ? S L H L
Hts1       Cochliobolus carbonum        2405            S D F F S S G G N S M A A I A L
CssA       Tolypocladium niveum         13645           D N F F E L G G H S L L A ? K L

B) Acyl carrier proteins
Polyketide synthases
Act-ACP    Streptomyces coelicolor      33              LRFEDIGYDSLALMET
Gra-ACP    Streptomyces violaceoruber   33              ITFEELGYDSLALMES
Fatty acid synthases
FAS-ACP    Escherichia coli             28              S F    E D    L G A    D T V
FAS-ACP    Saccharomyces                73              Q F    K D    L G L    D T V

consensus                                               L G x(HD)S L

a Sequence data are derived from the sequences of TycA, GrsB, SrfA-B, AcvA, CssA, Hts1, Act-ACP, Gra-ACP, FAS-ACP
from E. coli, and FAS-ACP from yeast.



[Figure 7 panels: (A) posttranslational phosphopantetheinylation (coenzyme A, 3',5'-ADP, apo-PCP, holo-PCP); (B) acylation of holo-PCP (e.g., aminoacylation) by the adenylation domain; (C) fates in peptide synthesis: amide bond formation and ester bond formation.]

Figure 7. A scheme showing (A) the conversion of the thiolation domain (PCP) from apo to holo protein, through the action of
a 4'-PP transferase, which directs the nucleophilic attack of the hydroxyl group of the highly conserved PCP-serine to the
β-phosphate of CoA, allowing the transfer of the 4'-PP moiety onto PCP. (B) Acylation of holo-PCP by an aminoacyl adenylate substrate
attached to the adenylation domain. (C) Amide (and ester) bond formation between two amino acid residues (between an amino
acid and a carboxy acid, respectively) that are activated as acyl-S-Pant thioesters on two adjacent thiolation (PCP) domains.
Shown is the attack of nitrogen (or oxygen) nucleophiles to yield an amide (ester) bond in peptide (and depsipeptide) biosynthesis.
(see Figures 3 and 4), in which the A and T domains
are separated by the N-methylation domain. 34,35
Although ACP and PCP proteins are functionally
similar, they show only a limited degree of overall
homology except around the site of cofactor binding
within the signature sequence LGx(HD)SL (Table 4). 

In integrated peptide synthetases, the activated 
aminoacyl adenylate substrates on the A domain are
transferred to the terminal cysteamine thiol group
of the 4'-PP cofactor (Figures 2 and 7B), which is
covalently attached to the side chain of the conserved
serine residue within the signature sequence. 22,24,27,101
The essential role of this serine residue in cofactor 
binding and in linking the activated amino acid 
substrates as carboxyl thioester to the 4'-PP pros- 
thetic group has been demonstrated by numerous 
investigations, including site-directed mutagenesis 
and affinity labeling studies. 22,23,27,84,101 Recently, the
thiolation domain of the peptide synthetase TycA has 
been biochemically characterized. 24 A region of about 
100 amino acid residues surrounding the site of 
cofactor binding was overproduced in E. coli and 
partially posttranslationally modified from apo to the 
active holo form. The modification was assisted by 
a 4'-PP transferase (EntD, see below) that utilizes 
CoA and the T-domain as substrates. 25 It catalyzes 
the nucleophilic attack of the β-hydroxy side chain
of the conserved serine on the pyrophosphate linkage 
of CoA, resulting in the transfer of the 4'-PP moiety 
onto the attacking serine (Figure 7A). This recom- 
binant PCP protein fragment was active in amino 
acylation in the presence of an adenylation domain 
and radiolabeled cognate amino acid (Figure 7B). The 
detection of radiolabeled amino acid covalently at- 
tached to the nonintegrated, separately expressed 
PCP domain of TycA clearly indicated that PCP can 
be acylated in vitro by the A domain. 24 These results
provide strong evidence for the functional integrity of
these domains and for the multiple carrier model of 
nonribosomal synthesis (Figure 7C). 

Further studies on posttranslational modification 
of PCP and ACP proteins in vitro using radiolabeled 
CoA led to the discovery of a superfamily of proteins 
that catalyze the conversion of apoproteins to their 
holo forms. 25 Among this group of 4'-PP transferases 
are the gene products encoded by sfp, gsp, and bli,
which all are associated with bacterial operons 
encoding peptide synthetases. They utilize CoA as 
a common substrate, and appear to attain specificity 
through protein/protein interactions. For example, 
it has been shown that the E. coli apo-ACP is a
substrate of the E. coli-encoded ACPS (ACP
synthase, a specific 4'-PP transferase for ACP) but
not a substrate of the EntD protein, a second 4'-PP
transferase present in E. coli, which was shown to
be specific for the EntF protein involved in entero-
bactin synthesis. 25,56 The apo-PCP protein (the T
domain of TycA) was found to be a poor substrate
for ACPS of E. coli, but an excellent substrate for Sfp
protein, which is the 4'-PP transferase associated 
with the surfactin biosynthesis operon of Bacillus 
subtilis. These findings argue for the presence of 4'- 
PP transferases that show a specific protein partner- 
ship, when converting an apoprotein to its holo form. 
Therefore, one would expect that there are additional 
Table 5. The 4'-Phosphopantetheinyl Transferases

protein a   pathway           organism(s)           size (aa)

antibiotic production
Gsp         gramicidin S      B. brevis             273
Bli         bacitracin        B. licheniformis      225
EntD        enterobactin      E. coli               209
                              S. typhimurium        232
                              S. austin             232
                              S. flexneri           209
Lpa-14      iturin            B. subtilis           224
Psf-1       surfactin         B. pumilus            233
Sfp         surfactin         B. subtilis           224

anabolic pathways
ACPS        fatty acids       E. coli               126
FAS2        fatty acids       S. cerevisiae         1894
                              C. albicans           ?
                              P. patulum            1857
                              S. pombe              1842
                              A. nidulans           1559
HI0152      fatty acids       H. influenzae         235
LYS5        lysine            S. cerevisiae         272

cellular division
HetI        differentiation   Anabaena sp.          237
                              Synechocystis sp.     246

a Sequences are available from the GenBank, SwissProt, or
EMBL databases.



as yet unidentified 4'-PP transferases specific for each 
biosynthetic system. Moreover, since Sfp, Gsp, and 
other transferases associated with template-directed 
synthesis are not essential proteins for survival of 
the host, one has to predict the presence of homolo- 
gous transferases that are specific for the modifica- 
tion of the essential ACP proteins of fatty acid 
synthesis and other proteins that require the prosthetic
4'-PP group. 25,56 Through refined sequence
comparisons, which indicated low level similarity 
with the primary structure of ACPS, Lambalot and 
co-workers recognized two conserved sequence motifs 
shared among a group of enzymes, whose genes were 
previously shown to be associated with peptide 
antibiotic production, anabolic pathways (e.g., fatty 
acid synthesis) and cellular division (Table 5). 25 The 
overall similarity of these proposed 4'-PP transferases
(e.g., Sfp, Gsp, and Bli) with ACPS is only
about 12-22%, whereas those 4'-PP transferases
associated with bacterial peptide antibiotic production
(Sfp, Gsp, and Bli) show more than 30%
identity. 25,63-66,106
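
Percentages such as the 12-22% and >30% figures quoted above are typically computed from pairwise alignments. The Python sketch below shows one common convention, under stated assumptions: the sequences are already aligned and of equal length, "-" marks a gap, and columns containing a gap are excluded from the denominator. The function name and the toy alignment are hypothetical, and the source does not state which convention was used for the values it cites.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical six-column toy alignment: 4 identical residues out of
# 5 gap-free columns.
toy_identity = percent_identity("ACDE-F", "ACDQGF")
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which gives lower values for gapped alignments, so identity figures from different sources are comparable only when the denominator is known.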

C. Condensation Domain 

In contrast to the A and T domains catalyzing 
amino acid activation and thioesterification, virtually 
no biochemical data are available to date about the 
part of modular peptide synthetases referred to as 
the condensation domain (C domain, see Figure 4 and 
Table 3). Its actual function therefore remains 
putative. Nevertheless, the accumulating sequence 
information of different peptide synthetase systems 
suggests that the C domain is responsible for the 
condensation of two amino acids activated on adja- 
cent modules, i.e., catalyzes elongation of the growing 
peptide chain. 6,23,59

The C domains are inserted between each consecu- 
tive pair of activating units (which may include 
additional tailoring domains like epimerization and 
N-methylation) within the polypeptide chains of
[Figure 8 scheme: two T-domains (PCP) flanking the C-domain.]

Figure 8. Suggested mechanism for the condensation/elongation reaction in peptide synthesis. Two amino acid residues
attached as thioesters to adjacent thiolation domains (T-domain or PCP) via the cofactor 4'-PP and the second histidine
residue conserved within motif C3 (HHxxxDG, see Table 6) of the condensation domain (C-domain) are shown. A nucleophilic
attack of the incoming amino group on the thioester-activated carboxyl group of the preceding amino acid is proposed.



peptide synthetases. This setup corresponds to the 
basic chemical requirements for the sequential link- 
age of activated amino acids to yield a linear peptide. 
Consequently, the number of C domains found in 
bacterial peptide synthetase systems coincides with 
the number of peptide bonds of the linear intermedi- 
ate (see Figure 3). Since the functions of the A and 
T domains in amino acid activation have largely been 
elucidated, 24,49 the remaining C domains are the ideal
candidates to catalyze peptide bond formation. It is 
unlikely that the A and T domains could also take 
charge of the elongation reaction, because separate, 
equivalent proteins in other systems (acyl-CoA- 
synthases, ACPs) are known to catalyze only the 
reactions shown to be attributed to the A and T 
domains, respectively. Moreover, as no other es- 
sential chemical reaction except amino acid activation 
and peptide bond formation is required to build up a 
linear peptide chain, no basically different catalytic 
function would be conceivable for the ubiquitous C 
domain, considering that a role as a simple spacer 
between functional domains is unlikely due to its size 
(Figures 3 and 4). 

The condensation domain is about 450 amino acids 
in length. Database searches of this region have not 
revealed any related enzymes that might have had 
a common ancestor with a similar enzymatic activity. 
The occurrence of this domain seems to be restricted 
to the superfamily of peptide synthetases. Within 
this group, the C domains show moderate homology 
to each other. Their distribution in multifunctional 
peptide synthetases seems to follow two simple rules 
(Figure 3): (I) A C domain is always present between 
two adjacent activating units located on the same 
polypeptide (intramolecular amino acid transfer; e.g., 
GrsB, TycB, and TycC, SrfA-A and SrfA-B, AcvA, 
Hts, and CssA). 30-36,41,42 (II) When the two consecutive
A domains are not located on the same enzyme
and thus peptide bond formation has to be achieved 
between amino acids activated on two synthetases, 
the C domain is found at the N-terminus of the amino 
acid-accepting synthetase (intermolecular amino acid 
transfer; e.g., TycA to TycB, SrfA-A to SrfA-B, BacA 
to BacB). 32,33,41,42
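
The two placement rules, together with the observation that the number of C domains equals the number of peptide bonds in the linear intermediate, can be restated as a small data-structure sketch. The Python code below is a simplified illustration only: the synthetase and module names (SynA, SynB, M1 to M4) are hypothetical placeholders, tailoring and thioesterase domains are ignored, and each module is treated as one opaque activating unit.

```python
def place_c_domains(synthetases):
    """Insert "C" markers according to the two distribution rules.

    synthetases: ordered list of (name, [module, ...]) tuples, in the
    order in which the amino acids are incorporated.
    Rule I:  a C domain sits between adjacent modules on one polypeptide.
    Rule II: a C domain sits at the N-terminus of an accepting synthetase.
    """
    layout = []
    first_module_seen = False
    for name, modules in synthetases:
        units = []
        for module in modules:
            if first_module_seen:
                # Every module except the very first one is preceded by
                # a C domain, whether it follows a module on the same
                # polypeptide (rule I) or starts a new one (rule II).
                units.append("C")
            units.append(module)
            first_module_seen = True
        layout.append((name, units))
    return layout


def count_c_domains(layout):
    """Total number of C domains in a layout produced above."""
    return sum(units.count("C") for _, units in layout)


# Hypothetical two-enzyme system activating four amino acids in total.
system = [("SynA", ["M1"]), ("SynB", ["M2", "M3", "M4"])]
layout = place_c_domains(system)
n_c = count_c_domains(layout)
```

For this toy system the accepting enzyme SynB starts with a C domain, the initiating enzyme SynA carries none, and the three C domains match the three peptide bonds of a linear tetrapeptide.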

The C domains located at the N-terminus of ac- 
cepting synthetases are less conserved than the 
internal ones, and the core sequences given in Table 
3 are better conserved for internal C domains. 59 



Therefore, it is intriguing to speculate that the 
N-terminal C domains are also necessary for the 
accurate recognition of the preceding synthetase 
(protein/protein interaction); their observed sequence 
variations may be an indication of such a specialized 
recognition reaction. 

Recently, de Crécy-Lagard and co-workers have
pointed out a possible relationship of the C domain 
to chloramphenicol acetyltransferases (CAT) and 
dihydrolipoyl transacetylases (E2p), a part of the 
pyruvate dehydrogenase multienzyme complex (and 
other 2-oxo acid dehydrogenase complexes). 59 These 
enzymes catalyze the transfer of an acetyl group, 
activated as acetyl-CoA, onto a hydroxy moiety of 
chloramphenicol and an acetyl group bound as a 
thioester on dihydrolipoamide onto CoA, respectively. 
These reactions resemble the fate of amino acyl or 
peptidyl intermediates transferred from their thioester 
linkage on one 4'-phosphopantetheinyl moiety of 
peptide synthetases to the 4'-phosphopantetheinyl 
group of the following module (Figure 8). Although 
the C domains show no overall homology to CAT and 
E2p, the best conserved core motif C3 (HHxxxDG, 
see Tables 3 and 6) is a common feature. The crystal 
structures of CAT and the catalytic domain of E2p have
been solved and exhibit virtually identical topology.
In both cases the second histidine of the
HHxxxDG motif (the first is not conserved in E2p)
is thought to act as the general base promoting
nucleophilic attack of the hydroxy moiety of chloram-
phenicol and of the thiol group of CoA on the carbonyl
carbon atom of the acetyl thioester. 113,114 The histidine
is found in both structures to have an unusual
conformation with regard to its dihedral angles. This
conformation allows a hydrogen bond between the
imidazole nitrogen N1 and the carbonyl oxygen of the
same amino acid, which has been related to its
suggested properties as a base. 113,114 The conservation
of the HHxxxDG motif and the similar reaction
catalyzed in peptide bond formation may suggest an 
analogous function of the second histidine in nonri- 
bosomal peptide synthesis (Figure 8, see also the 
chapter on the epimerization domain in which the 
HHxxxDG motif is also found and a base is needed 
for the chemical reaction). 59 Studies on the pH 
optimum of nonribosomal peptide synthesis sug- 
gested the possibility of a catalytic histidine residue. 26 
However, although the role of histidine seems to be 
Table 6. Comparison of the Highly Conserved Core Motif C3 Found in the Putative Condensation Domains as
Well as Dihydrolipoyl Transacetylases and Chloramphenicol Acetyltransferases a
a Sequence data are derived from the sequences of GrsB, SrfA-A, CssA, EntF, SnbC, PvdD, RapP, E2 from
B. subtilis, E2 from Homo sapiens, CAT from E. coli, and CAT from C. perfringens.



consistent, its specific assignment to the process of 
elongation requires further evaluation. A mutation 
in which the Asp in the C3 motif (Table 3) was 
converted to Ala within the Val module of surfactin 
synthetase 2 (SrfA-B) resulted in loss of product 
formation, underlining its crucial role in nonriboso- 
mal peptide synthesis. 115 

The presence of a waiting position for the incoming 
amino acid within the C domain has been postulated 
as an alternative to the direct transfer from one 4'- 
PP cofactor to the next. 3 * The mechanism for the first 
peptide bond formation in pristinamycin I, actino- 
mycin D, and enterobactin is in obvious conflict with 
the model outlined in Figure 2, since in these cases 
the first module (SnbA, AcmS I, and EntE) lacks the 
thiolation domain (the binding site for
4'-phosphopantetheine). 38,116-118 The concept put forward
was extended as the general elongation process in 
peptide synthesis, following the corresponding mech- 
anism in polyketide and fatty acid synthesis. 

More biochemical studies, ideally focused on a 
single elongation event, will be required to understand
the general principles of peptide bond formation
catalyzed by peptide synthetases and the
still putative role of the condensation
domain within this process.

D. Thioesterase as Integrated Domain and 
Distinct Protein 

A region of about 250 amino acid residues located
at the C-terminal end of bacterial modules that are
involved in adding the last amino acid to the linear
peptides (Figures 3 and 4; pink in the following
modules: ACV-Val, GrsB-Leu, SrfA-C-Leu, TycC-
Leu, and BacC-Asn) exhibits homology to thio-
esterases. 29,31-33,41,42,119,120 This region is referred to
as the thioesterase domain (TE domain). It has been 
found in the same location in the bacterial operons 
encoding multifunctional enzymes for the synthesis 
of the ACV tripeptide, 29,31,119,120 bacitracin, 42 entero-
bactin, 107 gramicidin S, 32 pyoverdine, 39 surfactin, 33
and tyrocidine, 41 which are of bacterial and fungal
origin. Due to its location, it is tempting to speculate 
that the TE domain is involved in hydrolytic cleavage 
of the linear peptide products, i.e., termination of 
nonribosomal peptide biosynthesis. 

Such a spatial arrangement of TE domains in 
modular polyketide (e.g., erythromycin systems) 121 



and fatty acid synthases (integrated systems type 
I) 122,123 has been shown to be responsible for product
release. However, things seem to be more compli- 
cated in the polypeptide systems: the TE domain is 
present in systems producing linear (ACV), branched 
via ester bond (surfactin), branched through amide 
bond (bacitracin), and cyclic peptides (gramicidin S, 
tyrocidine). Strictly speaking, a thioesterase function 
would only be required in the case of linear products. 
The cleavage of the thioester linkage of the peptide 
chain attached to the 4'-PP cofactor of the last module 
should be achievable by intramolecular attack of a 
side chain to build branched products (bacitracin), 
or of the amino group of the first amino acid incor- 
porated for manufacturing cyclic products (gramici- 
din S, tyrocidine). The apparent explanation of the 
sequence data would be to postulate a cleaved linear 
intermediate in all cases, which could then be cyclized 
in a special fashion or not. However, no such 
intermediate has ever been described for peptide 
synthetases. Perhaps other, as yet unidentified, 
proteins are responsible for the final shape of the 
product. Alternatively, the TE domain could serve 
another function during product formation. In this 
respect, it is noteworthy that thioesterases and 
acyltransferases share a similar catalytic center 
(Table 7, signature sequence GxSxG). Thus the TE 
domain might actually be an acyl transferase domain. 
Cyclization or branching could then be the result of 
an intramolecular acyl transfer of the linear peptide 
chain. 

Eukaryotic peptide synthetases of which the primary
structure is known (Hts1, 36 Esyn1, 34 and CssA; 35
see Figure 3) lack the TE domain. It is striking
that all their products are cyclic, i.e., cleavage from the
enzyme might be achieved by intramolecular attack
of the linear peptide intermediate.

In analogy to the catalytic triad of thioesterases, a 
conserved aspartate residue is present within the TE
domain, whereas a conserved histidine can only be
found when large gaps in the amino acid alignments
are allowed (not shown). An in-frame deletion of the
TE domain of the surfactin synthetase 3 (SrfA-C) 
resulted in blocking of surfactin synthesis. 33 G. 
Turner and co-workers have recently mutated the 
conserved serine residue of the signature sequence 
(GxSxG) to alanine and also deleted the entire TE 
domain of ACV-synthetase of Penicillium chrysogenum
to analyze their role in nonribosomal peptide
synthesis. 120 The drastic reduction of product 
formation observed in both cases underlines the 
importance of the TE domain. 

Distinct genes encoding thioesterases have been 
detected within almost all bacterial peptide syn- 
thetase coding operons (Figure 3). The gene products 
are about 220-340 amino acid residues in length and
show clear homology to thioesterases involved in fatty 
acid biosynthesis in mammalian cells (see Figure 3 
and Table 7 for the GxSxG signature motif, also 
found in the TE domain). 33,41,53 As is the case for the
integrated thioesterases (TE domain), the actual
function of the operon-associated thioesterases GrsT,
SrfA-TE, and Tyc-TE (see Figure 3; light pink) 
remains unknown. There is some evidence that these 
thioesterases copurify with peptide synthetases. GrsT, 
the thioesterase of the gramicidin S biosynthesis 
operon, stimulates gramicidin S production in vitro 
to a certain extent. However, it has an inhibitory
effect at higher concentrations. 1131 A knockout mu- 
tant of Srf-TE of the surfactin biosynthesis operon 
results in a 6-fold reduction of surfactin production. 62 
Therefore, it can be speculated that these enzymes 
liberate mischarged peptide synthetases, which are 
blocked by a nonspecific thioesterification of their 4'-
PP cofactor. 

V. Modifying Domains 

In addition to the incorporation of a wide variety
of amino and hydroxy acids for which no aminoacyl-tRNAs
recognized by the ribosome exist in nature, peptide
synthetases can also carry out numerous modifications,
including N-acylations with β-hydroxy fatty
acids, N-methylations, and site-specific epimerizations
(Figure 1 and Table 1). 1,2,6,7 While N-acylations
depend on the action of a nonintegrated acyltransferase,
the particular domains of peptide synthetases
catalyzing substrate epimerization and N-methylation
are marked by signature sequence motifs neighboring
the adenylation and thiolation domains (Figures
3 and 4; Table 3). These modifying domains in



peptide synthetases dramatically increase the ver- 
satility and biological activity of nonribosomally 
synthesized peptides. 2 However, although a great 
deal of work has been done to elucidate the mode of 
substrate activation (adenylation and thiolation), the 
enzymatic reactions of substrate alteration are not 
completely understood. 

A. Epimerization Domain 

As concluded from initial work on GrsA and TycA, 
the phenylalanine racemases of Bacillus brevis (Figure
3), 22,27,49,84,132 substrate epimerization has been
shown to occur at the thioester stage, with the amino 
acyl-S-Pant enzyme-bound substrate. It was found 
that initiation of D-Phe-L-Pro dipeptide formation 
takes place exclusively with D-Phe, and therefore the 
L-Phe substrate should be epimerized prior to con- 
densation. By contrast, all attempts to detect D-Val 
as an intermediate in biosynthesis of the tripeptide 
δ-(L-α-aminoadipyl)-L-cysteinyl-D-valine failed. 60,108
Similar results were obtained while investigating the 
epimerization reaction of D-Val during the biosyn- 
thesis of actinomycin D. 116 - 133 Therefore, it has been 
assumed that in the latter two cases racemization 
takes place at the peptidyl rather than the amino acyl 
stage. A third example for introduction of D-amino 
acids using a protein template has been found in the 
fungal peptides cyclosporin A and HC-toxin (Figures 
1 and 3). In both peptides D-Ala residues were found 
to be incorporated by amino acid-activating modules 
that are devoid of an epimerase domain; 35 - 36 sub- 
strates are provided in the D configuration, which is 
brought about through the action of nonintegrated 
racemases. 134 

With the exception of the bac operon that encodes 
three bacitracin synthetases, 42 all epimerization do- 
mains of bacterial operons encoding peptide syn- 
thetases were found to be localized at the C-terminal 
end of the corresponding peptide synthetase (Figure 
3). 33,53,54 The recent sequence data from the bac
operon of Bacillus licheniformis 42 unveiled the first
bacterial examples of internal epimerase domains

Figure 9. Amino acid sequence alignment of epimerase domains (E domains) from gramicidin synthetase 1 (GrsA), 53
surfactin synthetase A (SrfA-A), 34,38 ACV-synthetase 30 of Penicillium chrysogenum and HC-toxin synthetase. 37 The locations
of the conserved core sequences E1 to E7 are marked by the boxes.



within the bacitracin synthetase 1 (BacA-Glu) and 
bacitracin synthetase 3 (BacC-Phe and BacC-Asp). 
On the basis of the substrates utilized and the
location, one could describe four structurally homologous
types of epimerase domains that are present
within modular peptide synthetases (see Figure 3):
(1) C-terminal located amino acyl-epimerases (e.g., 
GrsA 53 ); (2) C-terminal located peptidyl epimerases 



(e.g., ACV synthetase 29-31 ); (3) internal amino acyl 
epimerases (e.g., Hts I 36 ); and (4) internal peptidyl 
epimerases (e.g., BacA and BacC 42 ). 
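
The four-way classification above amounts to a lookup on two properties, the location of the domain and the kind of enzyme-bound substrate. As an illustration only, it can be sketched in Python as follows. The class name, the function name, and the field names are hypothetical and are introduced here for illustration; the example assignments simply restate the examples given in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpimeraseDomain:
    name: str         # label used in the text, e.g. "GrsA" (illustrative field)
    c_terminal: bool  # True if the E domain sits at the C-terminal end
    substrate: str    # "aminoacyl" or "peptidyl"


def epimerase_type(domain: EpimeraseDomain) -> int:
    """Return the type number (1-4) used in the classification above."""
    if domain.substrate not in ("aminoacyl", "peptidyl"):
        raise ValueError("substrate must be 'aminoacyl' or 'peptidyl'")
    if domain.c_terminal:
        return 1 if domain.substrate == "aminoacyl" else 2
    return 3 if domain.substrate == "aminoacyl" else 4


# Examples taken from the list in the text.
EXAMPLES = [
    EpimeraseDomain("GrsA", True, "aminoacyl"),
    EpimeraseDomain("ACV synthetase", True, "peptidyl"),
    EpimeraseDomain("Hts I", False, "aminoacyl"),
    EpimeraseDomain("BacA", False, "peptidyl"),
    EpimeraseDomain("BacC", False, "peptidyl"),
]

for d in EXAMPLES:
    print(d.name, "-> type", epimerase_type(d))
```

The sketch encodes only the two criteria named in the text; it says nothing about sequence similarity between the four types.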

Apart from the above-mentioned classification, 
sequence comparisons of epimerization domains of 
peptide synthetases revealed no significant homol- 
ogies either to known amino acid epimerases or to 
N-acyl racemases (Figure 9 and Table 3). Therefore,
one may infer that epimerase domains of peptide
synthetases may represent a novel class (Figure 8)
distinct from the well-known pyridoxal phosphate
(PLP)-dependent racemases (e.g., alanine racemases,
essential for providing D-Ala for the bacterial cell wall
biosynthesis) 135-137 and the PLP-independent racemases
(e.g., glutamate and proline racemases). 138-141
The latter two classes racemize free amino acids 
exclusively, rather than amino acyl or peptidyl-S- 
Pant enzyme-bound substrates. 

Sequence analysis and alignment studies of several 
epimerization domains highlight at least seven sig- 
nature sequence motifs within a region of about 450 
amino acid residues (Table 3 and Figure 9, E1-E7). 
According to a suggested reaction mechanism for 
epimerization, in which one of these core sequences 
(E2) may be involved, de Crecy-Lagard and co- 
workers have implicated motif E2 (HHxxxDxVSW) 
as a signature sequence for a superfamily of enzymes 
involved in acyl transfer and epimerization (Figure 
10). 59 This group of enzymes may share a similar 
catalytic mechanism based on the acid/base properties
of the second histidine residue in E2. 59,113,114 In
fact, this motif is also conserved within the proposed 
condensation domain (see condensation domain and 
Table 3, C3 motif), whose action requires a nucleo- 
philic attack of the incoming acyl N-terminus on the 
activated carbonyl of the preceding amino acyl 
thioester (Figure 8). In analogy, epimerization involves
a proton abstraction and readdition of the Cα
proton of the amino acyl or peptidyl moiety linked to 
the cofactor 4'-PP (Figure 10). 59 The observed de- 
pendency of template-directed peptide synthesis on 
pH indicates the possible involvement of a histidine 
residue. 26 Although this seems to be consistent with 
the proposed mechanism, it remains to be confirmed 
if a histidine residue would be required for such a 
racemization reaction. 
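
A signature motif such as E2 (HHxxxDxVSW) can be located in a protein sequence with a simple pattern search. The following Python sketch assumes that "x" in the printed motif stands for any residue. The function name and the toy sequence are hypothetical and serve only as an illustration; a strict pattern of this kind would also miss family members that carry substitutions at the conserved positions.

```python
import re

# 'x' in the printed motif HHxxxDxVSW is taken to mean "any residue" ('.').
E2_MOTIF = re.compile(r"HH...D.VSW")


def find_motif(sequence: str, pattern: "re.Pattern[str]" = E2_MOTIF) -> list:
    """Return the 1-based start positions of all matches, including overlaps."""
    seq = sequence.upper()
    positions = []
    start = 0
    while True:
        match = pattern.search(seq, start)
        if match is None:
            break
        positions.append(match.start() + 1)
        start = match.start() + 1  # step by one residue to allow overlaps
    return positions


# Hypothetical toy fragment, not a real synthetase sequence.
toy_fragment = "MKTAHHILMDGVSWQLAG"
print(find_motif(toy_fragment))  # prints [5]
```

The same function can be reused for other core motifs by passing a different compiled pattern.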

B. N-Methyltransferase Domain

N-Methylation is another modification of nonribosomally
synthesized peptides that significantly contributes
to their biological activity and to peptide
bond stabilization against proteolytic cleavage (Table



1). Recently, sequencing of the entire fungal genes 
of cyclosporin A synthetase (cssA) 35 and enniatin 
synthetase (esyn1) 34 confirmed that the N-methyltransferase
activity is associated with integral parts
of the respective multifunctional peptide synthetases 
(Figures 3 and 4). The sequence data revealed a 
novel type of module possessing an insertion of about 
420 amino acids (M domain) between the A and T 
domains. The occurrence of these insertions within 
the amino acid activating modules coincides with the 
number of N-methylated residues in the corresponding
peptide product (e.g., seven N-methylated residues
are present in cyclosporin A and one in enniatin;
Figures 1 and 3). The insertion contains at least 
three signature core motifs (M1-M3; Figure 4 and 
Table 3), including a glycine-rich sequence M1
(VL(ED)xGxGxG) that exhibits significant similarity
to the common S-adenosylmethionine (SAM) binding 
site of a heterologous class of cosubstrate-dependent 
methyltransferases (Table 8). 
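
The correspondence between M-domain insertions and N-methylated residues can be expressed as a simple count over the modules of a synthetase. The following Python sketch is illustrative only. The string notation for modules, the function names, and the order and domain composition of the modules in the two layouts are assumptions; only the number of M-domain insertions (one for enniatin synthetase, seven for cyclosporin A synthetase) is taken from the text, and the total of eleven modules for cyclosporin A synthetase reflects the eleven residues of cyclosporin A.

```python
def has_m_insertion(module: str) -> bool:
    """True if an M domain is inserted between the A and T domains.

    A module is written as a dash-separated string of domain letters,
    e.g. "C-A-M-T" (hypothetical notation used only in this sketch).
    """
    domains = module.split("-")
    return any(
        domains[i:i + 3] == ["A", "M", "T"] for i in range(len(domains) - 2)
    )


def predicted_n_methylations(modules: list) -> int:
    """Count the modules that carry an M-domain insertion."""
    return sum(has_m_insertion(m) for m in modules)


# Simplified layouts; the module order is arbitrary and NOT taken from the text.
ENNIATIN_LAYOUT = ["C-A-T", "C-A-M-T"]
CYCLOSPORIN_LAYOUT = ["C-A-M-T"] * 7 + ["C-A-T"] * 4

print(predicted_n_methylations(ENNIATIN_LAYOUT))     # prints 1
print(predicted_n_methylations(CYCLOSPORIN_LAYOUT))  # prints 7
```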

Some characteristics of N-methyltransferase domains
have been analyzed by overproducing functional
fragments of enniatin synthetase (Figure
3). 28,96,150 Analysis of the recombinant proteins
revealed that L-methylvaline activation can be assigned
to a C-terminal 158 kDa fragment of the
second module. This protein encompasses the A, M,
and T domains and can be affinity labeled with
[14C]SAM, verifying the presence of the methyltransferase
domain. Further N- and C-terminal deletions
led to a 65 kDa protein forming the 420 amino acid
insertion mentioned above. 28,150 UV-induced photoaffinity
labeling of this deletion mutant indicated the
localization of the methyltransferase activity in this
region. 151 These studies also revealed that N-methylation
occurs at the thioester stage prior to peptide
bond formation. 28,96,150,151

Cosubstrate dependence of the methylation reaction
in general and SAM charging of the methylation
domain in particular have been confirmed by the use
of potent inhibitors like sinefungin and S-adenosylhomocysteine. 151
Sinefungin acts as a competitive
inhibitor and totally prevents photolabeling with
[14C]SAM, while the noncompetitive inhibitor
S-adenosylhomocysteine only lowers the apparent affinity
for the cosubstrate, causing a reduced SAM
charging even at high concentrations of the inhibitor.
In any case, a dramatically reduced synthesis of a
non-methylated peptide product could be observed,
indicating the SAM dependence of N-methylation
during nonribosomal synthesis of modified peptide
antibiotics.

Table 8. SAM-Dependent N-Methyltransferases: Sequence Alignment around the Highly Conserved
Cofactor-Binding Motif M1

Enzyme                    Organism                      Position (aa)   Sequence

A) Peptide synthetases
Esyn                      Fusarium scirpi               2083            RDVLBIGTGSGHI
CssA                      Tolypocladium niveum          2100            GHVLK IOTGTOMV
                                                        3592            GHVLBVOTOTGKV
                                                        5063            GRVX.SVOTGTGM I
                                                        6577            G H V t. E I GTGTGMV
                                                        9133            ghilbigagtom:
                                                        10625           rpcab iotgtomv
                                                        13190           gkvls iotgtomv

B) Diverse SAM-dependent methyltransferases
DNA (adenine, HindIII)    Haemophilus influenzae        255             QIVLDPFAOSOTTLLAA
DNA (adenine, BanIII)     Bacillus aneurinolyticus      47              IRGX»DPSCGDGELLLSl>
DNA (cytosine, BamHI)     Bacillus amyloliquefaciens    320             OVVLDPFGOSOTTFAVS
rRNA (adenine)            Bacillus anthracis            46              DTVX.8LOAGKGAL TTVL
tRNA (uracil)             Escherichia coli              212             GDLLBLYCONGNFSLAL
EryG                      Saccharopolyspora erythraea   83              DEVLDVOFOLOAODFFW
RapM                      Streptomyces hygroscopicus    104             R TVfcBVQCOMOEC LNFL

Consensus                                                               vL(DE)xgxGxG

Sequence data are derived from Esyn, 34 CssA, 35 DNA-MTR from H. influenzae, DNA-MTR from
B. aneurinolyticus, 43 DNA-MTR from B. amyloliquefaciens, rRNA-MTR from B. anthracis, 14
rRNA-MTR from E. coli, 146 EryG, 124 and RapM.

VI. Prospects for the Construction of Hybrid 
Antibiotics 

We have discussed the modular organization of 
multifunctional peptide synthetases, the large en- 
zyme complexes representing the protein templates 
for the biosynthesis of defined peptide products, and 
have shown that they are assembled from multifunc- 
tional building blocks (domains). 1 - 2 5 6 Localization 
and enzymatic properties of these disparate building 
blocks were originally postulated on the basis of 
sequence comparison with enzymes having known 
functions or have been revealed by biochemical 
means. Accordingly, dissection of particular modules 
and biochemical analysis of the separated domains 
shed light on the molecular bricks used for as- 
sembling the conveyor belt (template) required for 
the biosynthesis of a defined peptide product. 24 - 28 - 49 - 55150 
It has been established that the nonribosomal syn- 
thesis of a bioactive peptide is brought about by such 
a protein template that contains the appropriate 
number and correct order of activating units. 6 - 7 - 23 
These advances will enable (and have already enabled) the
development of techniques for the rational design of 
bioactive peptides 79 and for exploring the potential 
of protein templates in combinatorial synthesis for 
the generation of structural diversity. 24 

As a first attempt, we have recently described the 
reprogramming of a given protein template (Figure 
11a). 79 A programmed alteration within the primary 
structure of a peptide antibiotic could be accom- 
plished by the substitution of an amino acid-activat- 
ing module at the genetic level (Figure 11a). Accord- 
ing to this two-step recombination method, the 
chromosomal target site of a desired biosynthesis 
gene has been marked through a specific double
cross-over event with a selectable marker. Subsequently,
the disrupted gene was reconstituted by a
replacement plasmid that delivered an engineered 
hybrid gene into the marked chromosome through a 
second marker exchange reaction. The introduced 
hybrid gene encodes a peptide synthetase with an 
altered substrate specificity that targets amino acid 
substitution into the corresponding position of the 
peptide product. 

Initially, this recombination method was set up for 
reprogramming the surfactin synthetase 3, which 
integrates L-leucine at position 7 in the cyclic lipopeptide
antibiotic surfactin (SrfA-C; Figures 1 and 3).
An integration vector was constructed that contains 
the flanking region of the leucine-activating minimal 
module, strictly speaking the coding fragments of the
N-terminal condensation domain and the C-terminal
thioesterase domain (TE domain). An in-frame integration
of coding regions of various A-T modules
of bacterial and fungal origin between the linkers led
to the construction of hybrid genes encoding heterologous
SrfA-C derivatives (C-[A-T]-TE) with altered
substrate specificities, defined by the incoming adenylation
domains. After delivering the hybrid gene(s)
into the marked chromosome by homologous recombination,
the surfactin derivatives produced by the
various B. subtilis strains were extracted from the
culture broth and analyzed by infrared spectroscopy
as well as mass spectrometry. These studies clearly 
confirmed the identity of the novel, engineered lipo- 
peptides derived by targeted domain replacement. In 
order to investigate the influence of amino acid 
substitutions on surfactin hemolytic activity, the 
derivatives of surfactin were investigated for their 
ability to lyse erythrocytes. It was found that 
disrupting the operon resulted in a complete loss of 
activity, whereas the hybrid biosurfactants restored 
that activity. Until now, numerous domain swaps 
(see Figure 3) have been accomplished, 7 152 indicating 
that the stage of rational design for bioactive peptides 
is no longer an illusion. 
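
The logic of such a module swap can be sketched with a toy model of the protein template, in which each module is reduced to its domain list and the residue it incorporates. The Python sketch below is a simplification under stated assumptions: the names `Module`, `swap_a_t_unit`, and `product` are hypothetical, the domain composition of modules 1 to 6 is reduced to C-A-T (epimerization domains are ignored), the residues at positions 1 to 6 are taken from the known sequence of surfactin rather than from this section, and "Xaa" is an arbitrary placeholder for the residue selected by the incoming adenylation domain.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Module:
    domains: Tuple[str, ...]  # e.g. ("C", "A", "T", "TE")
    substrate: str            # residue selected by the A domain


def swap_a_t_unit(template: List[Module], position: int,
                  new_substrate: str) -> List[Module]:
    """Return a new template whose module at the 1-based position carries an
    A-T unit with a different specificity; flanking domains stay in place."""
    index = position - 1
    if not 0 <= index < len(template):
        raise IndexError("position outside the template")
    target = template[index]
    if "A" not in target.domains or "T" not in target.domains:
        raise ValueError("module has no A-T unit to exchange")
    hybrid = replace(target, substrate=new_substrate)
    return template[:index] + [hybrid] + template[index + 1:]


def product(template: List[Module]) -> str:
    """Peptide sequence dictated by the order of the modules."""
    return "-".join(m.substrate for m in template)


# Positions 1-6 are an assumption from the known surfactin sequence;
# position 7 (L-Leu, SrfA-C, C-[A-T]-TE) is the one described in the text.
residues = ["Glu", "Leu", "D-Leu", "Val", "Asp", "D-Leu", "Leu"]
wild_type = [Module(("C", "A", "T"), r) for r in residues[:-1]]
wild_type.append(Module(("C", "A", "T", "TE"), residues[-1]))

hybrid = swap_a_t_unit(wild_type, 7, "Xaa")
print(product(wild_type))  # prints Glu-Leu-D-Leu-Val-Asp-D-Leu-Leu
print(product(hybrid))     # prints Glu-Leu-D-Leu-Val-Asp-D-Leu-Xaa
```

The function returns a new list and leaves the original template unchanged, which mirrors the comparison of wild-type and engineered strains.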

Figure 11. Proposed strategies of gene manipulations for the production of hybrid (a and b) and type II (c) peptide
synthetases to generate specific alteration (a and b) and diversity (c) in peptide synthesis using the multienzyme thiotemplate
mechanism.

Although the method described above represents
a general approach for the specific modification
of a desired peptide, it has a few limitations
that further investigations will have to overcome.
On the one hand, the method
requires well-defined sequence information about the 
biosynthetic system to be manipulated. Unfortu- 
nately, only a limited number of biosynthetic systems 
are currently characterized at the primary level. 
However, progress in the identification and sequenc- 
ing of peptide synthetase genes from various organ- 
isms has been recently made by taking advantage of 
the strong conservation of signature core motifs in 
the domain structure (see Figure 4 and Table 3). 
These fingerprint regions have allowed the detection 
of peptide synthetase-encoding genes using the PCR 
method. 51 - 52 

On the other hand, the structure-function relationships of
the engineered secondary metabolites are difficult to 
predict. The success of novel peptides depends 
mainly on their biological activity, and it seems 
unlikely that random changes would improve proper- 
ties. Structural modeling tools like computer-aided 
drug design could be a prospect to overcome this 
problem and to stimulate the development of peptide- 
based drugs. 153-155 The emerging structural and 
functional capabilities of nonribosomally synthesized 
peptides will allow us to evaluate whether these 
sources can be used to create new products of 
biological significance. 

A direct approach for a targeted reprogramming 
of peptide synthetases could be available by defining 
the structural basis of substrate specificity. Determination
of the crystal structure of the first adenylation
domain PheA (see section IV.A) has provided
knowledge about the moieties involved in substrate
binding and in constituting the specificity
pocket. 75 Combined with the available sequence data
for several A domains, 29-39 - 41 - 42 this may provide a 
foundation for understanding the structural basis of 
substrate specificity in modular peptide synthetases. 



These results may permit the alteration of certain 
residues within the adenylation domains by site-directed
mutagenesis to modify the substrate specificity.
Thus, programmed modifications in the primary
structure of a given peptide antibiotic might
be achieved by the substitution of a few particular
amino acid residues (within its adenylation unit)
instead of exchanging entire modules or domains. 

In general, the concepts presented have focused on 
the modification of natural secondary metabolites 
that are known to possess biological activities. In the 
near future, the field of study may be moving toward 
a complete engineering of a biosynthetic system 
(conveyor belt; Figure 11b). 7

In addition to the reprogramming and possible de 
novo generation of protein templates, a third possibility
for the production of hybrid bioactive peptides
would be combinatorial peptide synthesis accomplished
by nonintegrated peptide synthetases (Figure
11c). 24 As mentioned above, fatty acid synthases 
(FAS), polyketide synthases (PKS), and peptide syn- 
thetases share a similar mode of product assembly 
and possess a modular arrangement. 43 156-158 Nev- 
ertheless, a distinction can be made between large 
modular enzymes and enzyme complexes composed 
of sets of freely dissociable proteins, which are 
classified as type I and type II enzymes, respectively. 
The architecture of discrete proteins (type II) is only 
appropriate for systems involving multiple repeated 
reaction cycles (e.g. FAS, PKS), whereas the biosyn- 
thesis of a defined peptide essentially depends on the 
presence of a protein template containing the correct 
order and appropriate number of activating
modules. 1,2,5-7 All peptide synthetases studied so far
are exclusively modular enzymes of type I. However, 
we have recently demonstrated that catalytic domains
of peptide synthetases are also able to act as
individual enzymes and to interact productively in trans
with other distinct domains. 24
Such a complex corresponds to a functional type II 
FAS or PKS and may provide a tool for some 
combinatorial approaches as previously shown for 
PKS. 43,159-161 Construction of artificial type II
synthetases would increase the ability to generate
manifold peptides with diverse structures (Figure 
11c). 

The increasing number of pathogenic organisms 
that are becoming more and more resistant to traditional
therapy necessarily requires innovative concepts
to generate novel pharmaceutically useful
drugs. 162-165 Because of their enormous structural
and functional diversity, nonribosomally synthesized 
peptides seem to be privileged to meet these de- 
mands. 7 The (near) future will show how far the 
various approaches will go toward the engineering 
of novel peptides of therapeutic use. 

VII. Conclusions 

Much has been learned about the modular struc- 
ture of peptide synthetases, the multienzymes needed 
for nonribosomal peptide synthesis in bacteria and
filamentous fungi. These protein templates catalyze 
the successive condensation of amino acids through 
adenylation, thiolation, and transpeptidation reac- 
tions on specific activation units designated modules, 
whose spatial organization defines the order of the 
incorporated residues in the final product. The 
modules, depending on their role in activation and
modification of the substrate, are composed of several
domains. Most of these domains seem to act as
independent catalytic units, although located on a 
single polypeptide chain. For example, the adeny- 
lation domain, the heart of each module, acts inde- 
pendently at the level of substrate recognition and 
activation. For peptide condensation and substrate 
modification, however, specific contacts with other 
domains are necessary. The crystal structure of the 
adenylation domain PheA may act as a prototype for 
other adenylation domains within this superfamily 
of enzymes. From these structural studies, detailed 
information is expected to emerge that may allow 
specific alterations to modify the substrate specificity. 
Biochemical data on the integrity of the thiolation
domain and its posttranslational modification with
the cofactor have recently been obtained, strengthening
the model of the multidomain arrangement of
peptide synthetases. Other modifying domains involved
in epimerization and N-methylation have been
mapped on the basis of limited biochemical studies 
or sequence alignments with enzymes of known 
function. Finally, knowledge of the modular struc- 
ture and domain organization of these multifunc- 
tional enzymes has been used successfully at the 
genetic level to alter the protein template. 

Although a large body of information is now accumulating
on the structure-function relationships
of this highly interesting family of multienzymes,
many of their basic aspects still remain unclear. It
is not known how transpeptidation occurs and which
role the condensation domains and the cofactor 4'-PP
play during this process. Accordingly, it is
unclear how this interplay precisely controls the 



direction of peptide chain growth. Little is also known
about the details of epimerization, N-methylation,
and cyclization reactions and exactly how the
substrate specificity within the adenylation domain
is determined.